
November 15, 2020

Single Precision Numbers

Storing numbers with a fractional part means encoding them in binary without taking up too much space.

Single Precision Numbers

On 32 bits, the bits are divided into 1 sign bit s (bit 31), 8 exponent bits e (bits 30 to 23) and the remaining 23 bits (bits 22 to 0) for the fractional part.

The formula for decoding a 32-bit floating point number is as follows:

$$n_{10} = (-1)^{s} \times 2^{e} \times \left( 1 + \sum_{i=1}^{23} b_{23-i} \times 2^{-i} \right)$$

where $n_{10}$ is the resulting decimal number, s is the sign bit (the most significant bit), e is the exponent decoded from the 8 exponent bits and $b_i$ is bit number i of the 23-bit fractional part.
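The formula can be checked with a short Python sketch. The helper names `float_to_bits` and `decode_float32` are made up for this illustration, and the sketch assumes the exponent is stored with an offset of 127 (see the exponent section). It only covers ordinary (normal) numbers: exponent fields that are all zeros or all ones encode special values that the formula does not describe.

```python
import struct

def float_to_bits(x: float) -> int:
    """Return the 32-bit pattern of x rounded to single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def decode_float32(bits: int) -> float:
    """Apply the decoding formula to a 32-bit pattern (normal numbers only)."""
    s = bits >> 31                    # sign bit (bit 31)
    stored = (bits >> 23) & 0xFF      # the 8 exponent bits as an unsigned integer
    if stored in (0, 255):
        raise ValueError("special value: not covered by the formula")
    e = stored - 127                  # remove the offset of 127
    # b_(23-i) is weighted by 2^(-i), for i = 1 .. 23
    fraction = sum(((bits >> (23 - i)) & 1) * 2.0 ** -i for i in range(1, 24))
    return (-1) ** s * 2.0 ** e * (1 + fraction)

print(decode_float32(float_to_bits(-6.25)))
```

Here -6.25 is exactly representable, so the decoded value is the number we started from.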

Sign bit

The most significant bit (bit 31) is the sign bit. 0 means we encoded a positive number, and 1 is negative.

Exponent encoding

The exponent e is not encoded in two's complement but in offset-binary representation with an offset of 127: the stored 8-bit value is e + 127. This means that $0000\,0001_2$ represents -126, $0111\,1111_2$ represents 0 and $1111\,1110_2$ represents 127. The patterns $0000\,0000_2$ and $1111\,1111_2$ are reserved for special values (zero and subnormal numbers, infinity and NaN).
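As a minimal sketch of the offset encoding, assuming the standard single-precision offset of 127, the two hypothetical helpers below convert between a true exponent and its 8 stored bits. The all-zeros and all-ones fields are left out because they are reserved for special values.

```python
BIAS = 127  # offset added to the true exponent before it is stored

def encode_exponent(e: int) -> str:
    """Return the 8 stored bits for a true exponent e in [-126, 127]."""
    if not -126 <= e <= 127:
        raise ValueError("exponent outside the range of normal numbers")
    return format(e + BIAS, "08b")

def decode_exponent(field: str) -> int:
    """Return the true exponent for an 8-character string of stored bits."""
    return int(field, 2) - BIAS

for e in (-126, 0, 1, 127):
    print(e, encode_exponent(e))
# -126 00000001
# 0 01111111
# 1 10000000
# 127 11111110
```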

Fraction encoding

The fractional part of the number is encoded with standard binary encoding. There is a simple method to convert a decimal fractional part into binary:

  • multiply by two
  • take the integer part (either 0 or 1), which will be the binary bit number -1 (bit number 22 in our 32-bit floating-point encoding)
  • multiply the fractional part of the number obtained by 2
  • repeat for bit number -2 ... -23 (bits 21 to 0 in 32-bit floating-point encoding)

For example, for 0.345:

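| Multiply by 2       | Integer part | Fraction part | Bit number in 32-bit representation |
|---------------------|--------------|---------------|-------------------------------------|
| 0.345 * 2 = 0.690   | 0            | 0.690         | 22                                  |
| 0.690 * 2 = 1.380   | 1            | 0.380         | 21                                  |
| 0.380 * 2 = 0.760   | 0            | 0.760         | 20                                  |
| 0.760 * 2 = 1.520   | 1            | 0.520         | 19                                  |
| 0.520 * 2 = 1.040   | 1            | 0.040         | 18                                  |
| 0.040 * 2 = 0.080   | 0            | 0.080         | 17                                  |
| ..                  | ..           | ..            | ..                                  |
| 0.880 * 2 = 1.760   | 1            | 0.760         | 0                                   |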
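The multiply-by-two method can be sketched in a few lines of Python. The function name `fraction_to_bits` is invented for this example, and `Fraction` is used so that 0.345 is handled exactly instead of going through Python's own binary floats. The sketch simply stops after 23 bits (truncation), as in the table, without rounding the last bit.

```python
from fractions import Fraction

def fraction_to_bits(frac: Fraction, n_bits: int = 23) -> str:
    """Convert a fractional part in [0, 1) to its first n_bits binary digits."""
    bits = []
    for _ in range(n_bits):
        frac *= 2             # multiply by two
        bit = int(frac)       # integer part, either 0 or 1
        bits.append(str(bit))
        frac -= bit           # keep only the fractional part for the next step
    return "".join(bits)

print(fraction_to_bits(Fraction(345, 1000)))
# 01011000010100011110101
```

The leftmost character is bit 22 and the rightmost is bit 0, matching the first and last rows of the table.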

Range and Precision

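The fractional part is stored with 23 bits, which together with the implicit leading 1 gives 24 significant bits. This allows a precision of between 6 and 9 significant decimal digits ($2^{23} = 8\,388\,608$, $2^{24} = 16\,777\,216$). The exponent is stored on 8 bits, which allows normal numbers with magnitudes from $2^{-126} \approx 1.175 \times 10^{-38}$ up to just under $2^{128}$: the largest power of two is $2^{127} \approx 1.701 \times 10^{38}$, and with all fraction bits set the largest value is $(2 - 2^{-23}) \times 2^{127} \approx 3.403 \times 10^{38}$.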
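These limits can be read straight from bit patterns. The sketch below is only an illustration: `bits_to_float32` and `to_float32` are hypothetical helpers built on Python's standard `struct` module, and the hexadecimal constants are the patterns for the smallest normal number, for $2^{127}$, and for the largest finite value.

```python
import struct

def bits_to_float32(bits: int) -> float:
    """Interpret a 32-bit pattern as a single-precision number."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def to_float32(x: float) -> float:
    """Round x to the nearest single-precision number."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

smallest_normal = bits_to_float32(0x00800000)  # exponent bits 0000 0001, fraction all zeros
largest_power = bits_to_float32(0x7F000000)    # exponent bits 1111 1110, fraction all zeros
largest_finite = bits_to_float32(0x7F7FFFFF)   # exponent bits 1111 1110, fraction all ones

print(smallest_normal, largest_power, largest_finite)

# With 24 significant bits, 2^24 + 1 cannot be stored and is rounded to 2^24.
print(to_float32(16777217.0) == 16777216.0)
```

The last line shows where integer precision runs out: every integer up to $2^{24}$ is exact, but $2^{24} + 1$ is not.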